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■ Abstract 

Sh 1 Using a new analysis approach, we establish a general convergence theory of the Shift- 

| ^ . Invert Residual Arnoldi (SIRA) method for computing a simple eigenvalue nearest to a 

given target a and the associated eigenvector. In SIRA, a subspace expansion vector at 
each step is obtained by solving a certain inner linear system. We prove that the inexact 
SIRA method mimic the exact SIRA well, that is, the former uses almost the same outer 
iterations to achieve the convergence as the latter does if all the inner linear systems 
are iteratively solved with low or modest accuracy during outer iterations. Based on 
. the theory, we design practical stopping criteria for inner solves. Our analysis approach 

applies to the Jacobi-Davidson (JD) method with the fixed target a as well, and a similar 
general convergence theory is obtained for it. Numerical experiments confirm our theory 
and demonstrate that the inexact SIRA and JD are similarly effective and are considerably 
superior to the inexact SIA. 

Keywords. Subspace expansion, expansion vector, inexact, low or modest accuracy, 
(*C) ■ the SIRA method, the JD method, inner iteration, outer iteration. 
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in 

i/-) ; 1 Introduction 

OV 

" Consider the large and possibly sparse matrix eigenproblem 

Ax = Ax, 

^ _ with A 6 C nxn , the 2-norm ||x|| = 1 and the eigenvalues labeled as 

c5 \ < |Ai — a\ < \X 2 — a\ < ■ ■ ■ < \\ n - a\ 

for a given target a G C. We are interested in the eigenvalue Ai closest to the target a 
and/or the associated eigenvector Xi. We denote (Ai,xi) by (A,x) for simplicity. A number 
of numerical methods [2],ll4 |. n~5] . l20| . l21 j are available for solving this kind of problems. The 
Residual Arnoldi (RA) method and Shift-Invert Residual Arnoldi (SIRA) method are new 
ones that have their origins in the Jacobi-Davidson (JD) method [18]. RA was initially 
proposed by van der Vorst and Stewart in 2001; see a description in jllj . The methods were 
then studied and developed by Lee [10] and Lee and Stewart [TTJ. We briefly describe RA 
now. Given a starting vector vi with ||vi|| = 1, suppose an orthonormal V m = (vi, . . . , v m ) 
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has been constructed by the Arnoldi process. Then the columns of V m form a basis of the Tri- 
dimensional Krylov subspace /C m (A, vi) = span{yi, Avi, . . . , A m_1 vi}, and the next basis 
vector v m+ i is obtained by orthogonalizing Av m against V m . Let (A,y) be the candidate 
Ritz pair of A for a desired eigenpair of A with respect to /C m (A, vi), and define the residual 
r = Ay — Ay. Then the RA method orthogonalizes r against V m to get the next basis 
vector, which, in exact arithmetic, is just v m+ i obtained by the Arnoldi process. So the 
Arnoldi method is mathematically equivalent to the RA method. However, van der Vorst 
and Stewart discovered a striking phenomenon that the RA method exhibits a more robust 
convergence characteristic under perturbations in r than the Arnoldi method does in Av m . 

The Shift-Invert Arnoldi (SIA) method is just the Arnoldi method applied to the shift- 
invert matrix B = (A — <rl) -1 and finds A nearest to a and the associated x. It computes 
v. m+ i by orthogonalizing u = Bv m against V m , whose columns are now a basis of /C m (B, vi). 
So at step m one has to solve the linear system 

(A - al)u = v m . (2) 

The SIRA method jlO|ll lj is an alternative of the RA method applied to B. It is an orthogonal 
projection or Rayleigh-Ritz method that, like SIA, computes the desired eigenpair (A,x) of 
A. In the SIRA method, at each step one has to solve the inner linear system 

(A - crl)u = r, (3) 

where r = Ay — uy is the residual of the current approximate eigenpair (v, y) obtained by 
SIRA. Then it computes v m+ i by orthogonalizing u against V m and expands fC m (B, vi) to 
/C m+ i(B,vi). A difference between SIA and SIRA is that the SIA method computes Ritz 
pairs of B with respect to /C m (B, vi) and recovers an approximation to (A, x), while the SIRA 
method computes the Ritz pairs of A with respect to /C m (B, vi) and gets an approximation 
to (A,x). So SIA and SIRA obtain similar but different approximations to (A,x) with respect 
to the same subspace /C m (B, vi). 

Since ([3]) is large, only iterative solvers are generally viable. This leads to the inexact 
SIRA, an inner-outer iterative method, built-up by outer iteration as the eigensolver and 
inner iteration as the solver of ([3]). Inexact eigensolvers have attracted much attention over 
years, and among them inexact SIA type methods [U [El El [23] are closely related to the 
work in the current paper. A common research focus on all inexact eigensolvers is how the 
accuracy of inner iterations affects the convergence of outer iterations. 

The JD method with fixed or variable targets [18] is also an inexact eigensolver, in which 
a correction equation is solved iteratively at each outer iteration; see, e.g., the books [2|l20p 21| 
and more recent works [H ll3l[T9[ l22] . Since it is very hard to directly analyze the convergence 
of the standard JD method, one instead considers that of the simplified (or single- vector) JD 
method without subspace acceleration, in which the next approximate eigenvector is obtained 
by adding an approximate solution of the inner linear system to the current approximate 
eigenvector. As stated in the literature, the standard JD method is expected to be faster 
than the simplified JD method. So one hopes that the results on the accuracy requirement 
on inner iterations developed for the simplified JD may be seen as the worst ones for the 
standard JD. Nevertheless, this treatment may be problematic or too rough. On the one 
hand, since the standard JD is a Rayleigh-Ritz method, its convergence is not guaranteed 
even though projection subspace is accurate enough; see [9], also [2 | l20 p 21| for details. On the 
other hand, the standard JD should generally produce more accurate approximate eigenpairs 
than the simplified JD. Therefore, the standard JD method itself lacks a general theory of 
inner iterations, and a rigorous and more insightful analysis is obviously necessary. 

For the inexact SIA method, Simoncini [17] has established a relaxation theory on the 
accuracy of approximate solution of |2} as m increases. She proves that the accuracy of 
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approximate solution of ([2]) should be very high initially and then it can be relaxed as the 
approximate eigenpairs start converging. Freitag and Spence [3] have extended Simoncini's 
relaxation theory to the inexact implicitly restarted Arnoldi method. Xue and Elman |23| 
have made a refined analysis on the relaxation strategy for inner solves and on a special 
preconditioner with tuning in the inexact implicitly restarted Arnoldi method. As the results 
in these papers have indicated, the inexact SIA type methods have a common feature that 
one has to solve inner linear systems with high accuracy when approximate eigenpairs are of 
poor accuracy and then solves them with decreasing accuracy as the approximate eigenpairs 
converge. So it may be very costly to implement the inexact SIA type methods. 

For the SIRA method, it has been reported by Lee [10] and Lee and Stewart [11] that when 
the accuracy of approximate solutions of ([3]) is low or modest at each step, the method may 
still work well. From the viewpoint of Krylov subspaces and exploiting backward perturbation 
theory, Lee and Stewart [llj have analyzed the RA and SIRA methods by considering V m as 
a dynamic Krylov subspace /C m (B + E m , vi) at step m, where E m is a variable perturbation 
matrix with m, whose size cannot estimated. From the analysis and results in |10|.ll lj . it 
appears impossible to derive quantitative and explicit bounds for the accuracy requirement 
on inner iterations. 

In this paper, we take a different and general approach to giving a rigorous analysis of 
the inexact SIRA method and establish a general and quantitative theory of the accuracy 
requirement on inner iterations. Our analysis approach applies to the JD method with the 
fixed target a as well. We first show that the SIRA and JD methods are mathematically 
equivalent when the inner linear system and the correction equation involved in them are 
solved exactly, respectively. We then focus on a detailed quantitative analysis of the SIRA 
and JD methods. Let e be the relative error of the approximate solution of the inner linear 
system. We prove that a fairly small e, e.g., e £ [10 -4 , 10 -3 ], is generally enough to make the 
former ones use almost the same outer iterations as the latter ones to achieve the convergence. 
As a result, one only needs to solve all inner linear systems with low or modest accuracy in the 
SIRA and the JD methods, and both methods are expected to be considerably more effective 
than the inexact SIA method. We consider some issues on practical implementations. 

The paper is organized as follows. In Section [21 we review the SIRA and JD methods 
and show their equivalence when inner linear systems are solved accurately. In Section [3l 
we derive some relationships between e and subspace expansions and show that the inexact 
JD and SIRA methods are essentially equivalent when their respective inner linear systems 
are solved with the same accuracy. In Section [H we consider subspace improvement and 
the selection of e and prove that the inexact SIRA mimics the exact SIRA very well when 
e is fairly small at all steps. In Section [5l we consider some practical issues and design 
practical stopping criteria for inner solves in the inexact SIRA and JD. In Section El we 
report numerical experiments to confirm our theory and the considerable superiority of the 
inexact SIRA and JD algorithms to the inexact SIA algorithm. Meanwhile, we show that the 
inexact SIRA and JD are similarly effective. Finally, we conclude the paper and point out 
future work in Section [71 

Throughout the paper, denote by || • || the 2- norm of a vector or matrix, by I the identity 
matrix with the order clear from the context, by the superscript H the conjugate transpose of 
a vector or matrix, and by k(Q) = ||Q||||Q _1 || the condition number of a nonsingular matrix 
Q. We measure the distance between a nonzero vector y and a subspace V by 



where Py is the orthogonal projector onto V and the columns of V_l form an orthonormal 
basis of the orthogonal complement of V. 



sinZ(V,y) 



11(1 -Pv)y 



y 
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2 Equivalence of the exact SIRA and JD methods 



Algorithms [TH2] describe the SIRA algorithm and the JD algorithm with the fixed target a, 
respectively (for brevity we drop iteration subscript). Comparing them, we observe that the 
only seemingly differences between them are the linear systems to be solved (step 4) and 
the expansion vectors to be orthogonalized against the initial subspace V. In fact, they are 
equivalent, as the following theorem shows. 

Algorithm 1 SIRA method with the target a 

Given the target a and a user-prescribed convergence tolerance tol, suppose the columns 

of V form an orthonormal basis of an initial subspace V. 

repeat 

1. Compute the Rayleigh quotient H = V^AV. 

2. Let {y, z) be an eigenpair of H, where v = A. 

3. Compute the residual r$ = Ay — uy, where {y, y) = {v, Vz). 

4. Solve the linear system 

(A - al)u = r s . (5) 

5. Orthonormalize u against V to get v. 

6. Expand the subspace as V = [ V v ] and update H. 
until ||rs|| < tol. 



Algorithm 2 Jacobi-Davidson method with the fixed target a 

Given the target a and a user-prescribed convergence tolerance tol, suppose the columns 

of V form an orthonormal basis of an initial subspace V. 

repeat 

1. Compute the Rayleigh quotient H = V^AV. 

2. Let {y, z) be an eigenpair of H, where v = A. 

3. Compute the residual rj = Ay — z^y, where {y, y) = {y, Vz). 

4. Solve the correction equation for u _L y, 

(I - yy")(A - - yy")u = -rj. (6) 

5. Orthonormalize u against V to get v. 

6. Expand the subspace as V = [ V v ] and update H. 
until \\rs\\ < tol. 



Theorem 1. For the same initial V, if a ^ v, then the SIRA method and the JD method 
are mathematically equivalent when inner linear systems ([5]) and © are solved exactly. 

Proof. For the same initial V, the two methods share the same H, v and y, leading to the 
same rs and rj. Let us and uj be the exact solutions of ([5]) and ([6]), respectively. Since 
B = (A - cjI)" 1 , we get 

u s = Br s = (<t- u)By + y. (7) 

From ©, we have 

(A - al)uj = (y H (A - al)uj) y - r j = 7 y - (A - al)y , (8) 
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where 7 = y^(A — al)uj — a + v. Premultiplying two hand sides of © by B, we obtain 

uj = 7By - y. (9) 

Since uj _L y, we get 7 = j/ B ■ Since y € V, we have (I — Py)y = 0. So from ((TJ) and ©, 
we get 

(I - P v )By = -J— (I - P v )u s = -(I - P v )uj. (10) 

a — v 7 

Note that (I — Pv)u,s and (I — Py)uj (after normalization) are the subspace expansion 
vectors in SIRA and JD, respectively. The two methods generate the same subspace in the 
next iteration and (v, y) obtained by them are thus identical. □ 

From (jHJ), define 

r 'j = Ay - (cr + 7)y, 



where 



7 = y^(A - al)uj - a + v 



y^By' 

Then flSJ) and thus © become 

(A - <rl)u = r'j, (11) 

whose solution is — uj and is the same as uj up to the sign —1. So mathematically, hereafter 
we use (jlip as the inner linear system in the JD method. Since y^By approximates the 
eigenvalue of B, 7 + a = y j/ By + f approximates A. So r'j is a residual associated with 
the desired eigenpair (A, x), just like in (j^J). 

3 Relationships between the accuracy of inner iterations and 
subspace expansions 

We observe that ((5J) and (jlip fall into the category of 

(A- £ jI)u = aiy + a 2 (A-aI)y, (12) 
where specifically a\ = a — v and = 1 in SIRA and ai = — y i/ By an d a 2 = 1 in JD. The 



exact solution u of ([121) is 

u = aiBy + a 2 y- (13) 

Since (I — Pv)y = 0, the (unnormalized) subspace expansion vector is (I — Py)Bu = 
(I — Pv)By. Let u be an approximate solution of (|12p . whose relative error is defined by 

II u — u|| , 

(14) 



1 1 xx 1 1 

Then we can write 

u = u + e||u||f 
with f the normalized error direction vector. So we get 

(I-Pv)u = (I-Pv)u + £||u||fj_. (15) 

where 

fjL = (I — Pv)f • (16) 
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Define 

JI-Pv)u_ (I-Py)u 

||(I-Pv)u||' ||(I -PvH' 1 J 

which are the normalized subspace expansion vectors in the inexact and exact methods, 
respectively. We measure the difference between (I — Py)u and (I — Py)u by the relative 
error 

||(I-Py)u-(I-Pv)u|| 

e " lid -PvH (18) 

or by sinZ(v, v). Two quantities e and sinZ(v,v) are two valid measures for the difference. 
Next we establish a relationship between e and sinZ(v, v), which will be used in proving our 
final result in this paper. 

Lemma 1. With the notations defined above, it holds that 

sinZ(v, v) = esinZ(v, fj_). (19) 

Proof. Let Uj_ be an orthonormal basis of the orthogonal complement of span {(I — Py)u} 
with respect to C n . Since U^(I - P V )S = 0, by definition © we get 

sinZ(v,v) = sinZ((I-P v )u, (I-Pv)u) 
||Uf(I-P v )u|| 

ll(i -PvH 

||Uf(I-Pv)u-Uf(I-Pv)u|| 



ll(i -PvH 

|Uf ((I-Pv)u-(I-Pv)u) 



(20) 



ll(i -PvH 

From (|15p we have (I — Py)u — (I — Py)u = e||u||fj_. Substituting it into (|20p gives 

sinZ(v, v) = es'm Z(v, fj_). 

□ 

In order to make the inexact SIRA method mimic the SIRA method well, we must require 
that v approximates v with certain accuracy, i.e., e suitably small, so that the two expanded 
subspaces have comparable quality. We will come back to this key point and estimate e 
quantitatively in Section HI 

In what follows we establish an important relationship between e and e, and based on it 
we analyze how e varies with a\ and 02 for a given e. 

Theorem 2. Let y be the current approximate eigenvector and a = — ^ with a±, «2 in (|12p . 
We have 

2||B|| sinZ(y,x) _ 
£ ~ ||By-ay||sinZ(V,f) £ ' (21) 

Proof. By definition (|16H . we have 

HfjJ = ||(I-Pv)f|| =sinZ(V,f). 
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From (fT5|) . we get 



||(I-Py)u-(I-PvH 

l|u||||f±|| 

||(I-P V )U[| [[(I - P V )5 " (I " P V )u|| 

Hl|f±|| 

||(I-P v )u|l 



||(I-Pv)u| 

||(i -PvH 

llulHlfj.ll fc ||u||sinZ(V,f) 
By (fl~3]) , we substitute u = aiBy + a 2 y into the above, giving 

||(I-Pv)(aiBy + a 2 y)|| ; 
||aiBy + a 2 y||sinZ(V,f) 

||ai(I-Pv)By|| 
||aiBy + a 2 y||sinZ(V,f) £ 

||(I-Pv)By|| 



e. 



By + ^y 



sinZ(V,f) 



-£. 



Decompose y into the orthogonal direct sum 

y = cos Z(y, x)x + sin Z(y, x)g 

with glx and ||g|| = 1. Then we get 

(I-Pv)By = (I-P v )(cosZ(y,x)Bx + sinZ(y,x)Bg) 

'cosZ(y,x 



(22) 



(23) 



(I-Pv 

cosZ(y,x 
X-a 



-x + sinZ(y,x)Bg 
A — a 

x ± +sinZ(y,x)(I-P v )Bg, 



where xj_ = (I — Py)x. Making use of ||xj_|| = sinZ(V, x) < sinZ(y,x) and np^i < ||B| 
we obtain 



(I-Pv)By|| 



cosZ(y,x) 



< 



< 



< 



X-a 
cosZ(y,x) 
|A - cr| 
cosZ(y,x)| 



\X-a\ 
1 



x ± + sinZ(y,x)(I-P v )Bg 
■||xjj + ||(I -P V )Bg|| sin Z(y,x) 
+ ||(I-Pv)Bg||)sinZ(y,x) 



|A-<7 

< 2||B||sinZ(y,x). 



+ ||B|| sinZ(y,x) 



Therefore, combining the last relation with (|22|) establishes (|21D , 



(24) 
□ 



Observe that the linear system (A — <rl)u = y, which is also the one in the inverse power 
method at each step, falls into the form of (|12p by taking a\ = 1 and a 2 — 0. For this case, 
from (12111 we have 



g< 2||B||sinZ(y,x) ? 



(25) 



|By||sinZ(V,f) 

We comment that (i) sin Z(V, f) is moderate as f is a general vector and (ii) ||B||/||By | = 0(1) 
if y is a reasonably good approximation to x and in the worst case ||B||/||By|| < k(B). In 



7 



case that sinZ(V, f) is small, e becomes big for a fixed small e, that is, linear system (|12|) is 
allowed to be solved with less accuracy. So a small sinZ(V, f) is a lucky event. 

We can use this theorem to further illustrate why it is bad to solve (A — erl)u = y 
iteratively. For a fixed small e, (|25p tells us that e should become smaller as sinZ(y, x) — s> 
as the algorithms converge. As a result, we have to solve inner linear systems with higher 
accuracy as y becomes more accurate. More generally, this is the case when ||By — ay|| is 
not small and typically of 0(||B||). Therefore, for a = and more general a, the resulting 
method and SIA type methods are similar and no winner in theory. They are common in 
that they all require to solve inner linear systems accurately for some steps and they are 
different in that the former solves inner linear systems with poor accuracy initially and then 
with increasing accuracy as the algorithm converges, while the latter ones solve inner linear 
systems with high accuracy in some initial outer iterations and then with decreasing accuracy 
as the algorithms converge. 

Based on (|2ip . it is natural for us to maximize its upper bound with respect to a for a 
fixed e. This will make e is as small as possible, so that we pay least computational efforts 
to solve (|12p . This amounts to minimizing ||By — ay||. As is well known, the optimal a is 

argmin ||By — oy|| = y By, (26) 

Such a = — ^ corresponds to the choice a\ = — j/ B and «2 = 1 in ([12]) . exactly leading to 
linear system (|lip in the JD method. Therefore, in the sense of minimizing ||By — ay||, the 
JD method is the best. If we take a = pz^, which is the approximation to in SIRA, by 
letting a\ = a — v and a2 = 1, then (1120 becomes 



(A - erl)u = (A - al)y + (cr - u)y = r s , 

which is exactly the linear system in the SIRA method. In each of JD and SIRA, || By — ay|| 
is the residual norm of an approximate eigenpair (a,y) of B. 

In what follows, we denote e by £s and ej in the SIRA and JD methods, respectively. To 
derive our final and key relationships between es, £j and e, we need the following lemma, 
which is direct from Theorem 6.1 of [9] and establishes a close and compact relationship 
between sinZ(y,x) and the residual norm || By — ay||. 



Lemma 2. Suppose ^j^,xj is a simple desired eigenpair of B 6 C nxn and let (x, Xj_) be 
unitary. Then 



n 



X 



B [ x Xj 



L 



(27) 



assume 



where c H = x^BXj_ and L = X^BXj_. Let (a, y) be an approximation to 
that a is not an eigenvalue of L and define 

sep(a,L) = ||(L - al)" 1 !!" 1 > 0. (28) 

Then 

smZ(y,x)< — — . (29) 

sep (a, L) 

Combining (|29p with Theorem [21 we obtain one of our main results. 
Theorem 3. Assume that a is an approximation to and is not an eigenvalue o/L. Then 

2 II B !I 

6 " sep (a,L)sinZ(V,f) £ ' ( } 
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In particular, for a = and a = y^By ; which correspond to the SIRA and JD methods, 
respectively, assume that each of them is not an eigenvalue of~L. Then it holds that 

2 II B II _ , , 

£s < 7— v e, (31) 

sep(^,Lj sinZ(V,f) 

and 

2||B|| 

£J ~ S ep(y^By,L) S mZ(V,f) £ - (32) 

This theorem shows that once e is known we can a-priori determine the accuracy require- 
ments es and ej on approximate solutions of inner linear systems (JSJ) and (|6|). 
It is important to observe from (|30p that 



2||B|| 2jjBjj ^_ 

£ " sep(a,L)sinZ(V,f) £ 0(||B||) £ [£) 

if a is well separated from the eigenvalues of B other than -5-^— and B is normal or mildly non- 
normal and sinZ(V,f) is not small. For sinZ(V, f) small, noting that bound (130j) is compact, 
we are lucky to have a bigger e, i.e., to solve the inner linear system with less accuracy. If 
sep (a, L) is considerably smaller than ||B||, then e may be bigger than e considerably and 
we are likely lucky to solve the inner linear system with less accuracy. 

For the a's in the SIRA and JD methods, by continuity the corresponding two sep (a, L)'s 
are close. Therefore, for a given e, we have essentially the same upper bounds for es and ej. 
This means that we need to solve the corresponding inner linear systems ([5]) and (jSJ) in the 
SIRA and JD methods with essentially the same accuracy e. In this sense, we claim that the 
SIRA and JD methods are essentially equivalent. 



4 Subspace improvement and selection of e and e 

In this section, we first focus on the fundamental problem of how to select e to make the 
inexact SIRA and JD mimic the exact SIRA very well from the current step to the next one. 
Then we show how to achieve our ultimate goal: the determination of e. 

Recall that the subspace expansion vectors are v and v for the exact SIRA and the inexact 
SIRA or JD; see (pH). Define V+ = [ V v ], V+ = span{V + } and V+ = [ V v ], 
V+ = span{ V + }. In order to make the inexact SIRA method mimic the exact SIRA method 
very well, we must require that the two expanded subspaces V+ and V+ have almost the same 
quality, namely, sinZ(V+,x) ~ sinZ(V+,x), whose quantitative meaning will be clear later. 

Theorem 4. With the notations above, assume sin Z(v,x_l) / with xj_ = (I — P v )x0 
Then we have 

sinZ(V+,x) = sin Z(V, x) sin Z(v, xj_), (33) 
sinZ(V+,x) sinZ(v,xj_) 



sinZ(V+,x) sinZ(v,x_|_) 
Suppose Z(v, v) is acute. If r = sinZ -jy Xj j < 1, we have 



(34) 



■-.< "y <lH (35) 
sin Z(V+,xj 



If it fails to hold, it is seen from (|33p that sinZ(V+,x) = and the exact SIRA, SIA and JD methods 
terminate prematurely if dim(V+) < n. In this case, V+ is an invariant subspace of A and we stop subspace 
expansion. We will exclude this rare case. 
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Proof. Since 

sin 2 Z(V , x) — sin Z(V+, x) 
by || x _i_|| — sinZ(V,x) we obtain 

sinZ(V+,x) 
sin Z(V, x) 



fl-Pvlxl 



(I-Pvjx| 



|v"x| 2 , 



Iv^xl 



sin Z(V, x) 



l v ^ x ±| 
sinZ(V,x) 



|xj_ || cos Z(v, X_|_" 
sin Z(V, x) 

a/1 — cos 2 Z(v, xj_) 
sinZ(v,x_L), 



which proves ()33|) . Similarly, we have 

sinZ(V+, x) 



sin Z(V, x) 



smZ(v, xj_, 



Hence, from (133]) and (|36j) . we get (1341) . 
Exploiting the trigonometric identity 

■ //- x ■ // \ rj ^(v,x_l) + Z(v,x_l) . Z(v,x_l] 
sin Z(v, xj_) — sin Z(v, xj_ j = 2 cos sm ■ 



(36) 



2 2 
the angle triangle inequality 

|Z(v,x ± )-Z(v,x ± )| < Z(v,v). 
and the monotonic increasing property of the sin function in the first quadrant, we get 

^(v,x ± ) - Z(v,x ± ) 



sinZ(v,xj_) — sinZ(v, xj_)| < 2 



sm ■ 



2 sin ■ 



-(v,x_l) - Z(v,x ± ) 



< 2 sin 



Z(v,v) 



From (JMD, (Ell) and (HID, we obtain 
sinZ(V+,x) 



sinZ(V+,x) 



1 



< 2sinZ(v,v). 
sinZ(v, xj_) 



(37) 



1 



< 



< 



sin Z(v, xj_) 

|sin Z(v, xj_) — sin Z(v, Xj_) 

sin Z(v, xj_) 
2 sin Z(v, v) 



sinZ(v,x_|J 
2g 

sin Z(v, xj_) 



from which it follows that (1351) holds. 



□ 
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Prom (f33|) . we see that sinZ(v, x_|_) is exactly one step subspace improvement when V is 
expanded to V+. 

(|35p shows that, to make sinZ(V+,x) ~ sin Z(V+, x), r should be small. Meanwhile, (|35p 
also indicates that a very small r cannot improve the bounds essentially. Actually, for our 
purpose, a fairly small r, e.g., r = 0.01, is enough since we have 

0.99 < SinZ ^+' X ; < 1.01 
sinZ(V+,x) 

and the lower and upper bounds are very near and differ marginally. Therefore, V+ and 
V+ are of almost the same quality for approximating x. As a result, it is expected that the 
inexact SIRA or JD computes new approximation over V+ to the desired (A,x) that has 
almost the same accuracy as that obtained by the exact SIRA over V+. More precisely, the 
accuracy of the approximate eigenpair by the exact SIRA and that by the inexact SIRA or 
JD are generally the same within roughly a multiple c € [1 — r, 1 + r] (this assertion can 
be justified from the results in [HJ[9]). So how near the constant c is to one is insignificant, 
the inexact SIRA and JD generally mimic the exact SIRA very well when r is fairly small. 
Concisely, we may well draw the conclusion that r = 0.01 makes the inexact SIRA mimic 
the exact SIRA very well, that is, the exact and inexact SIRA methods use almost the same 
outer iterations to achieve the convergence. 

Next we discuss the selection of e. Once e is available, in principle we can exploit compact 
bounds (|3ip and (|32p to determine the accuracy requirements £s and £j on inner iterations 
in the SIRA and JD. 

From the definition of r, we have 

~ T 

e = -sinZ(v,x_L). (38) 

As Theorem [5] requires r < 1, we must have e < \ sinZ(v, xj_). But xj_ is not available and 
a-priori, so we can only use a reasonable estimate on sinZ(v,xj_) in (|38p . In the following, 
we will look into sinZ(v,xj_) and show that it is actually independent of the quality of 
the approximate eigenvector y, i.e., sinZ(y,x), and the subspace quality, i.e., sinZ(V,x). 
This means that sinZ(v, xj_) stays around some constant during outer iterations. Then we 
analyze its size, which is shown to be problem dependent and stay around some certain 
constant during outer iterations. Based on these results, we can propose a general practical 
selection of e. Obviously, in order to achieve a given r, the smaller sin Z(v, xj_) is, the smaller 
e must be and the more accurately we need to solve the inner linear system. 

We now investigate | cos Z(v, xj_ ) | and show that it is bounded independently of sin Z(y, x) 
and sinZ(V,x), so is sin Z(v, xj_). From ()10p and (|17p . it is known that v and (I — Pv)By 
are in the same direction. Therefore, from decomposition (|23p of y, we have 

[xfg - Py)By| 
|x ± ||||(I-P v )By|| 

xjf (I - Py)B(cos Z(y, x)x + sin Z(y, x)g) | 
||x ± ||||(I-P v )By|| 

x? (I - P V ) (^g^x + sin Z(y, x)Bg) 

||x ± ||||(I-P v )By|| 
cos Z(y, x) ||x_l || 2 + (A - cr) sin Z(y, x)x^Bg| 
|A-<7|||x ± ||||(I-P v )By|| 
cosZ(y,x)|||x ± || sinZ(y,x)|x^Bg| 
A - o-| ||(I - Pv)By|| ||x ± ||||(I - P v )By|| ' 



cos Z(v,xj_)| = 



< 
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Note that |x^Bg| < ||xj_ || ||Bg|| < ||x_l||||B|| and ||xj_|| = sinZ(V, x) < sinZ(y, x). So 
|cosZ(v,x ± )| < , ' COS ,f? T y,X ^" X t" ||+ sinZ ^ X ^ Bs " 



2||B||sinZ(y,x) , , 

< n t, ^ b • ( 39 ) 



Combining f)39j) and (I24p . we have 



A-a|||(I-P v )By|| ||(I-P v )By| 

' Z (y, x )l || R ,A sinZ(y,x) 

A-a| +l1 V ||(I-P v )By|| 
sinZ(y,:x 
I-P v )By| 



0(||B||)sinZ(y,x) . 

l~Av,^)l< 5^^4 = 0(1), (40) 

a seemingly trivial bound. However, the proof clearly shows that our derivation is general 
and does not miss anything essential. We are not able to make the bound essentially sharper 
and more elegant as the inequalities used in the proof cannot be sharpened generally. Never- 
theless, this is enough for our purpose. A key implication is that the bound is independent of 
sinZ(y,x) and sinZ(V,x), so j cos Z(v, xj_)j is expected to be around some constant during 
outer iterations, so is sin Z(v, xj_). 

It is possible to estimate sinZ(v, xj_) in some important cases. For the starting vector 
vi, it is known that the exact SIRA, SIA and JD methods work on the standard Krylov 
subspaces V = V m = ^m(B, vi) and V+ = V m +i = /C m+ i(B,vi). Here we have temporarily 
added iteration subscripts and assume that the current iteration step is m. It is direct from 
AMD to get 

m+1 

sinZ(V m+ i,x) = sinZ(vi,x) sin Z(vj, xjj, (41) 

where Vj, i = 2, 3, . . . , m+1 are exact subspace expansion vectors at steps i = 2, 3, . . . , m+1. 

For the Krylov subspaces V m and V m +i, there have been some estimates on sin Z(V m +i, x) 
in (5j[71[l5]. For B is diagonalizable, suppose all the Aj, i = 1,2, ...,n and a are real and 
-5-7— is also the algebraically largest eigenvalue of B, and define 

x^-xh (An-A 2 )(A- ( r) 



Then it is shown in 0[T5] that 



m+1 



sinZ(V m+ i,x) = sinZ(vi,x) sin Zfy^x^) < C Vl sinZ(vi,x) 



rj + y 7 ?? 2 - 1 



where C Vl is a certain constant only depending on vi and the conditioning of the eigensystem 
of B. So, ignoring the constant factor C V1 , we see the product FJ^ 1 sin Z(v,j, xj_) converges 
to zero at least as rapidly as 

/ \ rn 

I 1 \ 



\rj + \frf — 1 J 

As we have argued, all the sin Z ( Vj , xj_ ) , i = 2, 3, . . . , m + 1, stay around a certain constant. 
So basically, each step subspace improvement sin Z(vj, xj_), i = 2, 3, . . . , m + 1, behaves like 
and is no more than the factor 

1 

rj + ^Jrf - 1 ' 
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the average convergence factor for one step. Returning to our notation, we see the size of 
sinZ(v,xj_) crucially depends on the eigenvalue distribution. The better is separated 
from the other eigenvalues of B, the smaller sinZ(v,xj_) is. Conversely, if is poorly 
separated from the others, sinZ(v,xj_) may be near to one. For more complicated complex 
eigenvalues and/or a, quantitative results are obtained for sinZ(V m +i,x) and similar conclu- 
sions are drawn in [HIT]. However, we should point that these estimates may be conservative 
and also only predict linear convergence. In practice, a slightly superlinear convergence may 
occur sometimes, as has been observed in 

For r = 0.01, if sinZ(v,x_L) G [0.02,0.2], then by (PJ we have e G [10~ 4 ,10" 3 ]. Such 
sinZ(v,x_|_) means that -r 1 — is well separated from the other eigenvalues of B and the exact 
SIRA generally converges fast. In practice, however, for a given e we do not know the value of 
r produced by e as sin Z(v, xj_) and its bound are not known. For a given e, if we are unlucky 
to get a r not small like 0.01, the inexact SIRA may use more outer iterations than the exact 
SIRA. Suppose we select e = -4j— . Then if each sinZ(v, xj_) = 0.1, we get r = 0.01. For 
this case, we have a very good subspace V m for m = 10 since sin(Vio, x) < 10 -9 , so the exact 
SIRA generally converges very fast! For a real-world problem, however, one should not expect 
that is generally so well separated from the other eigenvalues that the convergence can 
be so rapid. Therefore, for real- world problems, we generally expect that e £ [10 _4 ,10~ 3 ] 
makes r < 0.01, so that the inexact SIRA and JD mimic the exact SIRA very well. 

Summarizing the above, we propose taking 

e £ [10~ 4 , 10~ 3 ]. (42) 

Our ultimate goal is to determine £s and £j for the inexact SIRA and JD. Compact 
bounds (f3~TI) and (f32j) show that they are generally of 0(e). However, it is impossible to 
compute the bounds cheaply and accurately. We will consider their practical estimates on £5 
and £j in Section EJ where we demonstrate that these estimates are cheaply obtainable. 

5 Restarted algorithms and practical stopping criteria for in- 
ner iterations 

Due to the storage requirement and computational cost, Algorithms [TH2] will be impractical 
for large steps of outer iterations. To be practical, it is necessary to restart them for difficult 
problems. Let M max be the maximum of outer iterations allowed. If the basic SIRA and 
JD algorithms do not converge, then we simply update vi and restart them. We call the 
resulting restarted algorithms Algorithms 3-4, respectively. 

In implementations, we adopt the following strategy to update Vi. For outer iteration 
steps i = 1, 2, . . . , M max during the current cycle, suppose (u[ , ) is the candidate for 
approximating the desired eigenpair (A, x) of A at the i-th outer iteration. Then we take 

Vi = y = arg min ||(A — I)y^ || (43) 

1=1, 2,..., M max 

as the updated starting vector in the next cycle. Such a restarting strategy guarantees that 
we use the best candidate Ritz vector in the sense of ()43|) to restart the algorithms. 

In what follows we consider some practical issues and design practical stopping criteria 
for inner iterations in the (non-restarted and restarted) inexact SIRA and JD algorithms. 

Given e, since L is not available, it is impossible to compute sep(^^:, L) and sep(y^By, L) 
in (|3ip and (|32|) . Also, we cannot compute sinZ(V, f) in (|3ip and (|32p . In practice, we simply 
replace the insignificant factor sinZ(V, f) by one, which makes £3 and £j as small as possible, 
so that the inexact SIRA and JD algorithms are the safest to mimic the exact SIRA. We 
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replace ||B|| by . . in the inexact SIRA and JD, respectively. For sep(^^,L), we can 
exploit the spectrum information of H to estimate it. Let ^, i = 2,3,...,m be the other 
eigenvalues (Ritz values) of H other than v. Then we use the estimate 

(44) 



sep I , L ~ min 

— a J i=2,3,. ...m 



CJ 



Note that it is very expensive to compute y^By but y^By « ■^p. So we simply use to 
estimate sep (y^By, L) . With these estimates and taking the equalities in compact bounds 
([3D and (JMJ) , we get 

_ Vi — a 

es = ej = e = 2e max 



j=2,3,. 



(45) 



It might be possible to have e > 1 for a given e. This would make u no accuracy as an 
approximation to u. As a remedy, from now on we set 



e = min{e,0.1}. (46) 

For m = 1, we simply set e = e. 

Note that nnjp is a-priori i 
is below e or not. However, it is easy to verify that 



Note that nnjp is a-priori and uncomputable. We are not able to determine whether it 



1 llu — u|| lire — (A — oT)u|| Ilu — ull , , . 

^ Ti — n — - K ( B ) ( 47 ) 



k(B) ||u|| ll r s|| ll u 

and 

1 llu -ull II - rj - (I - yy H )(A - at) (I - yy H )u|| , . Jlu - ull . N 

— - " „ „ " < i * y , „ yy ' " < kCB )- — - — - — -, (48) 

k(B') ||u|| ~ ||rj|| ~ v ' ||u|| ' v ; 

where u ± y and B' = B| ± = (A — al)~ 1 \ y ±, the restriction of B to the orthogonal 
complement of span{y}. Alternatively, based on the above two relations, in practice we 
require that inner solves stop when the a-posteriori computable relative residual norms 

||r s - (A - alM ^ £ (49) 



ll r 5| 

and 



rj - (I - yy")(A - < 7l)(I-yyg)u|| ^ g 



PjW 

for the inexact SIRA and JD, respectively. 

Remark. In [3lll6|[T7]. a-priori accuracy requirements have been determined for inner 
iterations in SIA type methods. In computation, a-posteriori residuals are intuitive, and are 
probably the only practical way to approximate the a-priori residuals. Here, by the above 
lower and upper bounds (|4T|) and ([4*81) that relate the a-posteriori relative residuals to the 
a-priori errors of approximate solutions, we have simply demonstrated that (|49p and (|50p 
are reasonable stopping criteria for inner solves. We see that the a-priori errors and the 
a-posteriori errors are definitely comparable once the linear systems are not ill conditioned. 



6 Numerical experiments 

We report numerical experiments to confirm our theory. Our aims are mainly three-fold: 
(i) Regarding outer iterations, for fairly small e = and 10~ 4 , the (non-restarted and 
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restarted) inexact SIRA and JD behave very like the (non-restarted and restarted) exact 
SIRA. Even a bigger e = 10 -2 often works very well, (ii) Regarding inner iterations and 
overall efficiency, the inexact SIRA and JD algorithms are considerably more efficient than 
the inexact SIA. (iii) SIRA and JD arc similarly effective. 

All the numerical experiments were performed on an Intel (R) Core (TM)2 Quad CPU 
Q9400 2.66GHz with main memory 2 GB using Matlab 7.8.0 with the machine precision 
Cmach = 2.22 x 10 -16 under the Microsoft Windows XP operating system. 

At the mth step of the inexact SIRA or JD method, we have H m = V^AV m . Let 
fai i = 1, 2, . . . , m be the eigenpairs of H m , which are ordered as 

i ( m ) i ^ i ( m ) i ^ ^ i (m) i 
\v\ — a\ < |^2 — cr\ < ■ ■ ■ < ' — a\. 

We use the Ritz pair y m to approximate the desired eigenpair (A, x) 

of A, and the associated residual is r m = Ay 
We stop the algorithms if 



km || < tol = max {||A||i, 1} x 10" 
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In the inexact SIRA and JD, we stop inner solves when (|4"9|) and (f50|) are satisfied, respectively, 
and denote by SIRA(e) and JD(e) the inexact SIRA and JD algorithms with the given 
parameter e. We use the following stopping criteria for inner iterations in the exact SIRA 
and SIA algorithms and the inexact SIA algorithm. 

• For the "exact" SIA and SIRA algorithms, we require the approximate solution u m +i 
to satisfy 

II / * t\~ II II 1 "™ ~ ( A ~ 0-I)5m+l|| , ln -l4 
||v m -(A-<Tl)u m+ i||, j| |i < 10 L \ 

In Tn || 

respectively. 

• For the inexact SIA algorithm, we use the stopping criterion (3.14) in [3], where we 
take the same outer iteration tolerance max{||A||i, 1} x 10 -12 and the steps m suitably 
bigger than the number of outer iterations used by the exact SIRA so as to ensure the 
convergence of the inexact SIA with the same accuracy. For the restarted inexact SIA, 
we take m the maximum outer iterations M max allowed for each cycle. 

In the numerical experiments, we always take the zero vector as an initial approximate 
solution to each inner linear system and solve it by the right-preconditioned non-restarted 
GMRES method. Outer iterations start with the normalized vector lj ■ ■ ■ , l) H ■ F° r the 

correction equation in the JD method, we use 

M m = (I - y m y„)M(I - y m y^) 



as a preconditioner, which is suggested in [21] . Here M ~ A — al is some preconditioner used 
for all the inner linear systems involved in the algorithms tested except JD. 

In all the tables below, we denote by I ou ter the number of outer iterations to achieve 
the convergence, by Iinner the total number of inner iterations and by Iq.i the times that 
e = 0.1 appears. Note that Iinner is a reasonable measure of the overall efficiency of all the 
algorithms used in the experiments. For Examples 1-3 we test Algorithms [THll the inexact 
SIA and exact SIRA; for Example 4 we test these algorithms and their restarted versions. 

Example 1. This problem is a large nonsymmetric standard eigenvalue problem of cry 10000 
of n = 1000 that arises from the stability analysis of a crystal growth problem from [I]. We 
are interested in the eigenvalue nearest to a = 7. The computed eigenvalue is A ~ 6.7741. 
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Method 


SIRA(1(T 2 ) 


JD(10~ 2 ) 


SIRA(1(T 3 ) 


JD(1(T 3 ) 


inexact SIA 


exact SIRA 


louter 


14 


14 


15 


15 


16 


13 


linner 


44 


43 


71 


70 


173 


278 


lo.i 



















Table 1: Example 2. crylOOOO with a = 7. 




Figure 1: Example 2. crylOOOO with a = 7. Left: outer residual norms versus outer iterations. 
Right: the numbers of inner iterations versus outer iterations. 

The preconditioner M is obtained by the incomplete LU factorization of A — al with drop 
tolerance 0.001. Tableland Figure [1] describe the results and convergence processes. 

We see from Table Q] and Figure Q] that for both e = 10~ 2 , 10~ 3 the inexact SIRA, JD 
and SIA behaved like the exact SIRA very much and used almost the same outer iterations. 
Clearly, smaller e is not necessary as it cannot reduce outer iterations anymore. 

Regarding the overall efficiency, the exact SIRA was obviously the most expensive method. 
It used 22 ~ 25 inner iterations per outer iteration. The inexact SIA was still the second most 
expensive method. The numbers of inner iterations were comparable and between 15 ~ 17 at 
each of the first 8 outer iterations where the accuracy of approximate eigenpairs was poor and 
the inner linear systems must be solved with high accuracy. As the approximate eigenpairs 
started converging, the relaxation strategy came into picture and the inner linear systems 
were solved with decreasing accuracy, leading to fewer inner iterations at subsequent outer 
iterations. Inner iterations used by the inexact SIA were only comparable to and finally 
below those used by the inexact SIRA and JD in the last very few iterations. In contrast, the 
figure indicates that, for the same e, the inexact SIRA and JD solved the linear systems with 
almost the same inner iterations per outer iteration. Because of this, the inexact SIRA and 
JD were much more efficient than the inexact SIA and used much fewer inner iterations than 
the latter. Table Q] shows that they were twice to four times as fast as the inexact SIA, and 
SIRA(10 -2 ) and JD(10 -2 ) was considerably more efficient than SIRA(10~ 3 ) and JD(10 -3 ). 
Finally, we mention that the inexact SIRA and JD were equally effective, as indicated by the 
numbers of inner iterations used for each e. 

Example 2. We consider the unsymmetric sparse matrix sherman5 of n = 3312 that has 
been used in (3j[l6] for testing the relaxation theory with a = 0. The computed eigenvalue 
is A ~ 4.6925 x 10~ 2 . The preconditioner M is obtained by the incomplete LU factorization 
of A — <rl with drop tolerance 0.001. Table [2] reports the results obtained, and the left and 
right parts of Figure [2] depict the convergence curve of ||r m || versus I ou ter an d the curve of 
linner versus I uter for the algorithms, respectively. 

We see from the left part of Figure [2] that the inexact SIRA, JD and SIA behaved like 



16 



Method 


SIRA(1(T 2 ) 


JD(10~ 2 ) 


SIRA(10~ 3 ) 


JD(10^ 3 ) 


inexact SIA 


exact SIRA 


louter 


10 


10 


9 


8 


9 


8 


linner 


71 


42 


84 


45 


125 


168 


lo.i 



















Table 2: Example 1. sherman5 with a = 0. 




Figure 2: Example 1. sherman5 with a = 0. Left: outer residual norms versus outer 
iterations. Right: the numbers of inner iterations versus outer iterations. 

the exact SIRA very much and used almost the same outer iterations. They mimic the 
exact SIRA better for e = 10" 3 than for e = 10" 2 . The fi gure also tells us that a smaller 
e < 10~ 3 is definitely not necessary as it could not reduce the number of outer iterations and 
meanwhile would consume more inner iterations. The results confirm our theory and indicate 
that our selection of e and e worked very well. It is obvious that, as far as outer iterations 
are concerned, all the algorithms converged quickly and smoothly. 

For the overall efficiency, the situation is very different. As is expected, we see from 
Table [2] and Figure [2] that the exact SIRA was the most expensive and the inexact SIA was 
the second most expensive. The exact SIRA used 24 inner iterations per outer iteration, 
and the inexact SIA used 18 ~ 20 inner iterations at each of the first 4 outer iterations 
where the accuracy of approximate eigenpairs was poor and the inner linear systems must be 
solved with high accuracy. As the approximate eigenpairs started converging, the relaxation 
strategy took effect and the inner linear systems were solved with decreasing accuracy, so that 
the numbers of inner iterations became increasingly smaller as outer iterations proceeded. In 
contrast, the inexact SIRA and JD were much more efficient than the inexact SIA, they used 
much fewer inner iterations than the latter and were one and a half times to three times as 
fast as the inexact SIA. Furthermore, we observe that the inexact JD and SIRA used quite 
few and almost constant inner iterations per outer iteration for each e, respectively, but the 
former was more effective than the latter. This may be due to the better conditioning of the 
coefficient matrix in the correction equation of JD. 

Example 3. This problem arises from computational fluid dynamics and the test matrix 
af23560 of 23560 is from transient stability analysis of Navier-Stokes solvers pp. We want 
to find the eigenvalue nearest to a = 0. The computed eigenvalue is A ~ —0.2731. The 
preconditioner M is obtained by the incomplete LU factorization of A — <rl with drop tolerance 
0.1; see Tableland Figure [3] for the results. 

First, we see from both Tableland Figure [3] that this problem was considerably more 
difficult than the previous two ones since all the algorithms used more outer iterations and 
much more inner iterations to achieve the prescribed convergence accuracy. 
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Method 


SIRA(1(T 2 ) 


JD(10~ 2 ) 


SIRA(10~ 3 ) 


JD(10~ 3 ) 


inexact SIA 


exact SIRA 


louter 


42 


40 


30 


31 


28 


27 


linner 


2289 


2217 


2563 


2622 


6884 


9173 


lo.i 


25 


20 













Table 3: Example 3. af23560 with a = 0. 




Figure 3: Example 3. af23560 with a = 0. Left: outer residual norms versus outer iterations. 
Right: the numbers of inner iterations versus outer iterations. 

For this example, the case that e = 0.1 occurred at about half of outer iterations in 
SIRA(10 -2 ) and JD(10~ 2 ). Regarding outer iterations, we observe from Figure [3] that the 
inexact SIA behaved like the exact SIRA very much for e = 10~ 3 and the inexact SIRA, JD 
and SIA exhibited similar convergence behavior to the exact SIRA and used comparable outer 
iterations. For the bigger e = 10 -2 , the inexact SIRA and SIA used more outer iterations 
and did not mimic the exact SIRA well. Again, the results confirmed our theory, showing 
that a low or modest accuracy e = 10~ 3 is enough and a somehow poorly chosen e = 10 -2 
worked well but the inexact SIRA and JD may need considerably more outer iterations. 

For the overall efficiency, the inexact SIA was better than the exact SIRA but much 
inferior to the inexact SIRA and JD. Actually, the inexact SIRA and JD were three times 
as fast as the inexact SIA. Although SIRA(10~ 2 ) and JD(10 -2 ) used more outer iterations 
than SIRA(10~ 3 ) and JD(10 -3 ), they were more efficient than the latter ones in terms of 
total numbers of inner iterations. The exact SIRA used roughly 350 inner iterations per 
outer iteration. The inexact SIA used many inner iterations and needed to solve inner 
linear systems with high accuracy for most of the outer iterations. Even after the relaxation 
strategy played a role, it still used much more inner iterations than the inexact SIRA and 
JD at each outer iteration. We find that, for the same accuracy e, the inexact SIRA and JD 
solved the linear systems with slowly varying inner iterations at each outer iteration. Table [3] 
demonstrates that the inexact SIRA and JD had very similar efficiency. 

Example 4. This unsymmetric eigenvalue problem dw8192 of 8192 arises from dielectric 
channel waveguide problems pp. We are interested in the eigenvalue nearest to the complex 
target a = O.Oli. The computed eigenvalue is A « 3.3552 x 10~ 3 + 1.1082 x 10~ 3 i. The 
preconditioner M is obtained by the incomplete LU factorization of A — <rl with drop tolerance 
0.001. Table [J] displays the results. 

As far as the eigenvalue problem is concerned, Table H] clearly indicates that this problem 
is much more difficult than Examples 1-3 since all the algorithms used much more outer 
iterations to achieve the convergence than those needed for Examples 1-3. But our inexact 
SIRA and JD algorithms still worked very well. For e = 10~ 3 , the inexact SIRA and JD 
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Method 


SIRA(1(T 3 ) 


JD(10~ 3 ) 


SIRA(10~ 4 ) 


JD(10" 4 ) 


inexact SIA 


exact SIRA 


louter 


120 


138 


101 


101 


104 


101 


linner 


472 


559 


622 


633 


2259 


2999 


lo.i 





2 













Table 4: Example 4. dw8192 with a = O.Oli. 



behaved like the exact SIA very well and used comparable outer iterations as the exact 
SIRA did. For the smaller e = 10 -4 , the inexact SIRA and JD used the exactly the same 
outer iterations as the exact SIRA. Furthermore, we have observed that SIRA(10 -4 ) and 
JD(10 -4 ) behaved like the exact SIRA very accurately and their convergence curves were 
almost indistinguishable from that of the exact SIRA. 

For the overall efficiency, Table 0] exhibited similar features to those in all the previous 
tables for Examples 1-3. The inexact SIRA and JD were similarly effective, and the former 
performed slightly better than the latter. Both them were much more efficient than the 
inexact SIA and actually four to five times as fast as the latter. 

Since this problem is difficult, we turn to use restarted SIRA and JD algorithms, Algo- 
rithms 3-4, to solve it with the maximum M max = 30 outer iterations allowed during each 
cycle. Table [5] lists the results obtained by the restarted inexact SIRA, JD and SIA as well 
as the restarted exact SIRA by taking e = 10~ 3 , 10~ 4 , where Irestart denotes the number of 
restarts used, i.e., the number of the cycles of Algorithms 1-2 for the given M max . Figure 0] 
depicts the convergence curve of all the restarted algorithms and the curve of linner versus 
Irestart, m which the zeroth restart in abscissa denotes the first cycle of Algorithms 3-4 and 
corresponds to the first restart in the left figure. 



Method 


SIRA(1(T 3 ) 


JD(10~ 3 ) 


SIRA(10~ 4 ) 


JD(10" 4 ) 


inexact SIA 


exact SIRA 


Irestart 


8 


7 


5 


5 


5 


5 


louter 


238 


198 


131 


131 


125 


129 


linner 


972 


761 


726 


731 


2442 


3641 


lo.i 



















Table 5: Example 4. Restarted algorithms with M max — 30. 





Figure 4: Example 4. Restarted algorithms with M max = 30. Left: outer residual norms 
versus restarts of outer iterations. Right: the numbers of inner iterations versus restarts. 

It is seen from Table [5] and the left part of Figure 0] that all the algorithms solved the 
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problem very successfully with no more than eight restarts used and the convergence processes 
were smooth. For e = 10 -3 , the restarted SIRA and JD used a little more restarts to achieve 
the convergence. For e = 10" 4 , it is impressive that the restarted SIRA and JD algorithms 
behaved like the restarted exact SIRA and inexact SIA very much and they used exactly the 
same five restarts as the latter two algorithms. Furthermore, it is seen from the left figure that 
their convergence curves of the restarted SIRA(10~ 4 ) and JD(10~ 4 ) almost coincided with 
that of the exact SIRA. We also find that, compared with TableHl the restarted SIRA(10 -4 ), 
JD(10 -4 ) and exact SIRA performed excellently since Iouter's used by them were very near 
to the ones by their corresponding non-restarted versions, respectively. 

Regarding the overall performance, for given e = 10 -3 and 1CT 4 , the restarted SIRA and 
JD algorithms performed very similarly and were about three times as fast as the restarted 
inexact SIA. During the last cycle, the restarted SIRA has already achieved the convergence at 
the tenth outer iteration. So we stopped the algorithm at that step and solved only nine inner 
linear systems by costing only 232 inner iterations. During each of the first four cycles, the 
restarted exact SIRA used almost constant inner iterations to solve twenty-nine inner linear 
systems. This is why, in the right part of Figure 01 the curve for the restarted exact SIRA 
is almost parallel to the abscissa with the first four restarts and then falls abruptly at last 
restart. As is expected, the restarted inexact SIRA and JD algorithms used almost constant 
inner iterations for the same e per restart, while the inexact SIA used fewer and fewer inner 
iterations as outer iterations converged. The figure clearly shows that the restarted inexact 
SIA used much more inner iterations than the restarted SIRA(10~ 4 ) and JD(10~ 4 ) at each 
of the first four cycles. 

We have tested some other problems. We have also tested the algorithms when tuning is 
applied to our preconditioner M [3]. All of them have shown that the inexact SIRA and JD 
mimic the inexact SIA and the exact SIRA very well for e = 10~ 3 , 10~ 4 and use much fewer 
inner iterations than the inexact SIA. As far as the overall efficiency is concerned, SIRA(10~ 2 ) 
and JD(10 -2 ) may work well and often use fewer inner iterations than SIRA(10 -3 ) and 
JD(10 -3 ), but they are likely to need considerably more outer iterations and cannot mimic 
the exact SIRA well. Therefore, we propose using s G [10 -4 , 10 -3 ] in practice. We have 
found that the tuned preconditioning has no advantage over the usual preconditioning and is 
often inferior to the latter for the linear systems involved in the inexact SIRA, JD and SIA 
algorithms. For example, we have found that for Example 3 the tuned preconditioning used 
about three times inner iterations more than the usual preconditioning. 

7 Conclusions and future work 

We have quantitatively analyzed the convergence of the SIRA and and JD methods and proved 
that one only needs to solve all the inner linear systems involved in them with low or modest 
accuracy. Based on the theory established, we have designed practical stopping criteria for 
the inexact SIRA and JD. Numerical experiments have illustrated that our theory works very 
well and the non-restarted and restarted inexact SIRA and JD algorithms behave very like 
the non-restarted and restarted exact SIRA algorithms. Meanwhile, we have confirmed that 
the inexact SIRA and JD algorithms are similarly effective and both them are much more 
efficient than the inexact SIA algorithms. 

It is well known that the (inexact) JD method with variable shifts is used more com- 
monly. The analysis approach used in our paper may be extended to analyze the accuracy 
requirement on inner iterations in the JD method with variable shifts and a rigorous general 
theory is expected. This work is in progress. 

Since the harmonic projection may be more suitable to solve the interior eigenvalue prob- 
lem, it is significant to consider the harmonic version of SIRA. Moreover, it is known that 



20 



the standard projection, i.e., the Rayleigh-Ritz method, and its harmonic version may have 
convergence problem when computing eigenvectors [HE]. So it is worthwhile to use the re- 
fined Rayleigh-Ritz procedure [6j[9] and the refined harmonic version [9] for solving the large 
eigenproblem considered in this paper. These constitute our future work. 

References 

[1] Z. Bai, R. Barret, D. Day, J. Dem mel, and J. Dongarra, Test matrix collection for 
non-Hermitian eigenvalue problems, http://math.nist.gov/MatrixMarket/ 

[2] Z. Bai, J. Demmel, J. DONGARRA, A. Ruhe, and H. van der Vorst, Templates for the 
Solution of Algebraic Eigenvalue Problems: A Practical Guide, SIAM, Philadelphia, PA, 2000. 

[3] M. A. Freitag and A. Spence, Shift- and-invert Arnoldi's method with preconditioned iterative 
solvers, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 942-969. 

[4] M. E. Hochstenbach and Y. Notay, Controlling inner iterations in the Jacobi-Davidson 
method, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 460-477. 

[5] Z. JlA, The convergence of generalized Lanczos methods for large unsymmetric eigenproblems, 
SIAM J. Matrix Anal. Appl., 16 (1995), pp. 843-862. 

[6] , Refined iterative algorithms based on Arnoldi's process for unsymmmetric eigenproblems, 

Linear Algebra Appl., 259 (1997), pp. 1-23. 

[7] , Generalized block Lanczos methods for large unsymmetric eigenproblems, Numer. Math., 

80 (1998), pp. 239-266. 

[8] , The convergence of harmonic Ritz values, harmonic Ritz vectors and refined harmonic Ritz 

vectors, Math. Comput., 74 (2005), pp. 1441-1456. 

[9] Z. JlA and G. W. Stewart, An analysis of the Rayleigh-Ritz method for approximating 
eigenspaces, Math. Comput., 70 (2001), pp. 637-648. 

[10] C. Lee, Residual Arnoldi method: theory, package and experiments, Ph.D thesis, TR-4515, De- 
partment of Computer Science, University of Maryland at College Park, 2007. 

[11] C. Lee and G. W. Stewart, Analysis of the residual Arnoldi method, TR-4890, Department 
of Computer Science, University of Maryland at College Park, 2007. 

[12] R. B. MORGAN, Implicitly restarted GMRES and Arnoldi methods for nonsymmetric systems of 
eguations, SIAM J. Matrix Anal. Appl., 21 (2000), pp. 1112-1135. 

[13] Y. Notay, Combination of Jacobi-Davidson and conjugate gradients for the partial symmetric 
eigenproblem, Numer. Linear Algebra Appl., 9 (2002), pp. 21-44. 

[14] B.N. Parlett, The Symmetric Eigenvalue Problem, SIAM, Philadelphia, PA, 1998. 

[15] Y. Saad, Numerical Methods for Large Eigenvalue Problems, Manchester University Press, UK, 
1992. 

[16] V. Simoncini, Variable accuracy of matrix-vector products in projection methods for eigencom- 
putation, SIAM J. Numer. Anal., 43 (2005), pp. 1155-1174. 

[17] V. Simoncini and D. B. Szyld, Theory of inexact Krylov subspace methods and applications 
to scientific computing, SIAM J. Sci. Comput., 25 (2003), pp. 454-477. 

[18] G. Sleijpen and H. Van der Vorst, A Jacobi-Davidson iteration method for linear eigenvalue 
problems, SIAM J. Matrix Anal. Appl., 17 (1996), pp. 401-425. Reprinted in SIAM Review, 
(2000), pp. 267-293. 

[19] A. STATHOPOULOS, Nearly optimal preconditioned methods for Hermitian eigenproblems under 
limited memory. Part I: Seeking one eigenvalue, SIAM J. Sci. Comput., 29 (2007), pp. 2162-2188. 

[20] G. W. Stewart, Matrix Algorithms Vol II: Eigensystems, SIAM, Philadelphia, PA, 2001. 



21 



[21] H. van der VORST. Computational Methods for Large Eigenvalue Problems, Elsevier, North 
Hollands, 2002. 



[22] H. VOSS, A new justification of the Jacobi-Davidson method for large eigenproblems, Linear 
Algebra Appl., 424 (2007), pp. 448-455. 

[23] F. Xue AND H. Elman, Fast inexact implicitly restarted Arnoldi method for generalized eigen- 
value problems with spectral transformation. Technical Report, Department of Computer Science, 
University of Maryland, 2010. 



22 



